Predicting Students Progression Using Existing University Datasets: A Random Forest Application
Abstract
This paper proposes using data available at Manchester Metropolitan University to assess which variables best predict student progression, defined as a student satisfactorily completing one year and either passing to the next or graduating. We combine Virtual Learning Environment (VLE) data with MIS student records and apply the Random Forest (RF) algorithm to the combined data set. RF was deemed useful in this case because of the large amount of data available for analysis. The paper reports initial findings for data from the 2007-08 period. The results suggest that variables such as students' time-of-day usage, the last time students access the VLE, and the number of document hits by staff are the best predictors of student progression. The paper contributes to VLE evaluation and highlights the usefulness, in an educational environment, of a technique initially developed in the field of biology.
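The idea of ranking predictor variables with a forest can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a simplified forest of depth-one trees (stumps) trained on bootstrap samples with random feature subsets, and scores each variable by its accumulated Gini impurity decrease. The data are synthetic, the feature names are hypothetical, and the link between `days_since_last_access` and progression is built into the generator, so the output says nothing about the real university data.

```python
import random

# Hypothetical feature names; only the first one drives the synthetic label.
FEATURES = ["days_since_last_access", "noise_a", "noise_b", "noise_c"]

def make_synthetic(n=120, seed=1):
    """Generate synthetic rows and 0/1 'progressed' labels (assumed data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.uniform(0, 60)] + [rng.random() for _ in range(3)]
        progressed = 1 if row[0] < 30 else 0
        if rng.random() < 0.1:  # flip 10% of labels to add noise
            progressed = 1 - progressed
        rows.append(row)
        labels.append(progressed)
    return rows, labels

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(rows, labels, features):
    """Return (impurity_decrease, feature, threshold) of the best single split."""
    parent = gini(labels)
    best = (0.0, None, None)
    n = len(rows)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[0]:
                best = (parent - child, f, t)
    return best

def forest_importance(rows, labels, n_features, n_trees=60, seed=0):
    """Normalised impurity-decrease importance from a forest of stumps."""
    rng = random.Random(seed)
    totals = [0.0] * n_features
    m = max(1, int(n_features ** 0.5))  # features considered per tree
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        feats = rng.sample(range(n_features), m)  # random feature subset
        gain, f, _threshold = best_stump(sample_rows, sample_labels, feats)
        if f is not None:
            totals[f] += gain
    s = sum(totals)
    return [t / s for t in totals] if s else totals

rows, labels = make_synthetic()
importance = forest_importance(rows, labels, len(FEATURES))
for name, score in sorted(zip(FEATURES, importance), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

A full Random Forest grows deeper trees and is normally taken from an established library; the stump version is used here only to keep the ranking mechanism visible in a few lines.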
Similar resources
Comparison of Random Forest and Logistic Regression Methods in Predicting Mortality in Colorectal Cancer Patients and its Related Factors
Background and Objectives: The purpose of this study was to predict the mortality rate of colorectal cancer in Iranian patients and determine the effective factors on the mortality of patients with colorectal cancer using random forest and logistic regression methods. Methods: Data from 304 patients with colorectal cancer registry from the Gastroenterology and Liver Research Center of Shah...
Predicting University Students' Academic Success and Choice of Major using Random Forests
In this paper, a large data set containing every course taken by every undergraduate student in a major university in Canada over 10 years is analyzed. Modern machine learning algorithms can use large data sets to build useful tools for the data provider, in this case, the university. In this article, two classifiers are constructed using random forests. To begin, the first two semesters of cou...
Comparison of Random Survival Forests for Competing Risks and Regression Models in Determining Mortality Risk Factors in Breast Cancer Patients in Mahdieh Center, Hamedan, Iran
Introduction: Breast cancer is one of the most common cancers among women worldwide. Patients with cancer may die due to disease progression or other types of events. These different event types are called competing risks. This study aimed to determine the factors affecting the survival of patients with breast cancer using three different approaches: cause-specific hazards regression, subdistri...
Classification of genome data using Random Forest Algorithm: Review
Random Forest is a popular machine learning tool for classification of large datasets. The Dataset classified with Random Forest Algorithm (RF) are correlated and the interaction between the features leads to the study of genome interaction. The review is about RF with respect to its variable selection property which reduces the large datasets into relevant samples and predicting the accuracy f...
Predicting disease progression in amyotrophic lateral sclerosis
Objective: It is essential to develop predictive algorithms for Amyotrophic Lateral Sclerosis (ALS) disease progression to allow for efficient clinical trials and patient care. The best existing predictive models rely on several months of baseline data and have only been validated in clinical trial research datasets. We asked whether a model developed using clinical research patient data could b...